MKTG 440
Goal: Make claims about a certain group of people or objects (“population”)
Problem: For cost/time reasons it is usually impossible to collect data on everyone
Solution: select a subset of the population, i.e. “sample”
Population: The entire group of individuals we are interested in.
Examples: All humans, all Tucson residents, all UA students, etc.
A parameter is a number describing a characteristic of the population (e.g., population mean)
Sample: The subset of the population we collect data from.
A statistic is a number describing a characteristic of a sample (e.g., sample mean)
How well the sample represents the population depends on the sampling design
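To make the parameter/statistic distinction concrete, here is a minimal sketch in Python. The population (weekly study hours for 10,000 students) is made-up data generated only for illustration; in practice the parameter is unknown and only the statistic is observed.

```python
import random
import statistics

random.seed(440)  # fixed seed so the sketch is reproducible

# Hypothetical population: weekly study hours for 10,000 students (made-up data)
population = [random.gauss(15, 5) for _ in range(10_000)]

# Parameter: computed from the entire population (normally unknown)
population_mean = statistics.mean(population)

# Statistic: computed from a sample of 100 drawn without replacement
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
print(f"Difference:                  {sample_mean - population_mean:+.2f}")
```

Rerunning with a different seed gives a different sample mean, while the population mean of a given population stays fixed.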
Suppose we were interested in surveying current students at UA
What is a sampling frame we could use?
One possibility:
What problems might we run into?
Suppose you want to describe the average income of influencers on a particular platform.
How might we generate a sampling frame for this target population?
How could that sampling frame lead to a biased estimate of income?
Sampling error refers to differences between a sample statistic and a population parameter that result from using a sample instead of a census.
Non-sampling error is error caused by our sampling method or data collection process.
May be due to…
Probability sampling
Non-probability sampling
Simple random sampling (SRS): Each element of the population has the same, known, non-zero probability of inclusion.
Mechanics
Advantages
Disadvantages
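A minimal sketch of drawing a simple random sample from a sampling frame, using Python's standard library. The frame of student IDs and the helper name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement.

    Every element has the same inclusion probability, n / len(frame).
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: a list of 5,000 student IDs
frame = [f"student_{i:04d}" for i in range(1, 5001)]

srs = simple_random_sample(frame, n=200, seed=1)
print(srs[:5])
print("Inclusion probability:", 200 / len(frame))
```

The sketch assumes a complete list of the population is available, which is often the hard part in practice.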
Systematic sampling: Include every k-th element from the population list.
Mechanics
Advantages
Disadvantages
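A short sketch of selecting every k-th element. The random starting point is an assumption added here (a common convention, not stated on the slide), and the helper name `systematic_sample` and the numbered frame are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element of the frame, where k = len(frame) // n."""
    k = len(frame) // n  # sampling interval
    if k < 1:
        raise ValueError("Sample size cannot exceed the size of the frame")
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: random start among the first k positions
    return frame[start::k][:n]

# Hypothetical frame: elements numbered 1..5000
frame = list(range(1, 5001))
sample = systematic_sample(frame, n=200, seed=2)

print(sample[:5])
print("Interval k:", len(frame) // 200)
```

If the list has a periodic pattern that lines up with k, the selected elements can all share that pattern, which is the usual caution with this design.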
Stratified sampling: Divide the population into strata, then randomly sample from each stratum.
Mechanics
Advantages
Disadvantages
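A minimal sketch of stratified sampling. It assumes proportionate allocation (the same sampling fraction in every stratum), which is one of several possible allocation rules; the class-year strata, their sizes, and the helper name `stratified_sample` are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Group units into strata, then draw an SRS of the same fraction from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        n_h = round(len(members) * fraction)  # assumed: proportionate allocation
        sample.extend(rng.sample(members, n_h))
    return sample

# Hypothetical population of (class_year, id) pairs
sizes = {"freshman": 4000, "sophomore": 3000, "junior": 2000, "senior": 1000}
population = [(year, i) for year, size in sizes.items() for i in range(size)]

sample = stratified_sample(population, stratum_of=lambda unit: unit[0],
                           fraction=0.05, seed=3)
counts = Counter(year for year, _ in sample)
print(dict(counts))
```

Every stratum is guaranteed to appear in the sample, in proportion to its share of the population.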
Cluster sampling: Divide the population into clusters, then randomly select entire clusters.
Mechanics
Advantages
Disadvantages
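A minimal sketch of one-stage cluster sampling: clusters are drawn at random and every unit in a chosen cluster is included. The dorm clusters and the helper name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

# Hypothetical population: 20 dorms (clusters) with 50 residents each
clusters = {
    f"dorm_{d:02d}": [f"dorm_{d:02d}_resident_{r:02d}" for r in range(50)]
    for d in range(20)
}

chosen, sample = cluster_sample(clusters, n_clusters=4, seed=4)
print("Selected clusters:", chosen)
print("Sample size:", len(sample))
```

Unlike the stratified design, most clusters contribute nobody to the sample, so results depend heavily on how similar the clusters are to one another.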
Cluster sampling
Stratified sampling
Cost Comparison: If plane tickets + hotels cost $1000, and surveying a person costs $2…
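The arithmetic behind this comparison can be sketched as follows. The two unit costs come from the slide; the scenario (50 cities, 1,000 respondents, 5 cities visited under cluster sampling) is a hypothetical chosen only to illustrate the calculation.

```python
TRAVEL_COST_PER_SITE = 1000  # plane tickets + hotels, from the slide
COST_PER_RESPONDENT = 2      # cost of surveying one person, from the slide

def total_cost(sites_visited, respondents):
    """Travel cost for each site visited plus a per-respondent survey cost."""
    return TRAVEL_COST_PER_SITE * sites_visited + COST_PER_RESPONDENT * respondents

# Hypothetical scenario: 50 cities, 1,000 respondents in total
# Cluster sampling: visit only 5 randomly chosen cities
cluster_cost = total_cost(sites_visited=5, respondents=1000)
# Stratified sampling with each city as a stratum: visit all 50 cities
stratified_cost = total_cost(sites_visited=50, respondents=1000)

print("Cluster sampling cost:   ", cluster_cost)     # 7000
print("Stratified sampling cost:", stratified_cost)  # 52000
```

With a fixed cost per site that dwarfs the per-person cost, the number of sites visited drives the total.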
We are interested in surveying adults in US cities.
We want equal coverage across cities and a sample that represents a range of income levels and ethnicities.
How can we do this by combining stratified and cluster sampling?
Stage 1: Geographical clusters (metropolitan area and rural area)
Stage 2: Ethnic and income strata within each geographical cluster
Data are obtained by taking an SRS for each sub-stratum (sub-cluster).
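The two stages can be sketched as below. This assumes Stage 1 draws a random subset of geographic areas and Stage 2 takes an equal-sized SRS from each stratum inside a chosen area; the area names, the three income strata, the sample sizes, and the helper name `multistage_sample` are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def multistage_sample(areas, n_areas, per_stratum, seed=None):
    """areas maps an area name to a list of (person_id, stratum) pairs."""
    rng = random.Random(seed)
    chosen_areas = rng.sample(sorted(areas), n_areas)  # Stage 1: pick clusters

    sample = []
    for area in chosen_areas:
        strata = defaultdict(list)
        for person_id, stratum in areas[area]:
            strata[stratum].append(person_id)
        # Stage 2: SRS within each stratum of the chosen cluster
        for stratum, members in sorted(strata.items()):
            take = min(per_stratum, len(members))
            for person_id in rng.sample(members, take):
                sample.append((area, stratum, person_id))
    return sample

# Hypothetical data: 10 areas, 300 people each, split evenly across 3 income strata
strata_names = ["low", "middle", "high"]
areas = {
    f"area_{a:02d}": [(f"a{a:02d}_p{i:03d}", strata_names[i % 3]) for i in range(300)]
    for a in range(10)
}

sample = multistage_sample(areas, n_areas=3, per_stratum=20, seed=6)
cell_counts = Counter((area, stratum) for area, stratum, _ in sample)
print(len(sample), "respondents across", len(cell_counts), "area-by-stratum cells")
```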
A marketing research firm is hired by a fashion brand to estimate the average monthly spending on clothing by online shoppers.
Convenience sampling: A sample of convenient elements
Advantages
Disadvantages
Judgmental sampling: The elements/subjects are selected based on the judgment of the researcher
Advantages
Disadvantages
Quota sampling: A two-step restricted form of judgmental sampling
| Control characteristic: Gender | Population % | Quota Sample |
|---|---|---|
| Male | 41% | 410 |
| Female | 55% | 550 |
| Non-binary/Other | 4% | 40 |
| Total | 100% | 1,000 |
Advantages
Disadvantages
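The two steps can be sketched as: (1) turn population percentages into quotas, then (2) accept respondents until each quota is full. The percentages and total come from the table above; the stream of arriving respondents is simulated, and accepting them in arrival order is an assumption standing in for the researcher's convenience or judgment.

```python
import random

def quota_targets(population_pct, total_n):
    """Step 1: convert population percentages into quota counts."""
    return {group: round(total_n * pct / 100) for group, pct in population_pct.items()}

def fill_quotas(respondents, targets):
    """Step 2: accept respondents in arrival order until every quota is full."""
    counts = {group: 0 for group in targets}
    accepted = []
    for person_id, group in respondents:
        if group in counts and counts[group] < targets[group]:
            counts[group] += 1
            accepted.append((person_id, group))
        if counts == targets:
            break
    return accepted, counts

population_pct = {"Male": 41, "Female": 55, "Non-binary/Other": 4}
targets = quota_targets(population_pct, total_n=1000)

# Simulated (hypothetical) stream of 5,000 people who happen to be available
rng = random.Random(5)
labels = rng.choices(list(targets), weights=[41, 55, 4], k=5000)
arrivals = list(enumerate(labels))

accepted, counts = fill_quotas(arrivals, targets)
print(targets)
print(len(accepted), "respondents accepted")
```

The sample matches the population on the control characteristic, but who ends up inside each quota is not random.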
Snowball sampling: Start with a few participants and ask them to refer additional participants via their social networks.
Advantages
Disadvantages
U.S. Census
Clinical trials
PS: I have included a card to indicate your support for a new Wal-Mart in Soledad. Please take a moment to fill out the card and mail it back to us. Your opinion is important to us.
Results from the survey:
Bias exists if whether someone responds is NOT independent of the outcome variable
Measured Effect = True Effect + Bias
What specific kind of bias here?
Have you come across other examples of possibly biased sampling?
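The identity Measured Effect = True Effect + Bias can be illustrated with a small simulation. Everything here is hypothetical: the town, the 40% true support rate, and the assumed response rates (supporters mail the card back more often than opponents). It is not a reconstruction of any actual survey result.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical town: 1 = supports the new store, 0 = opposes; true support is 40%
population = [1] * 4000 + [0] * 6000
true_support = statistics.mean(population)

# Assumed response behavior: response depends on the outcome itself
response_prob = {1: 0.30, 0: 0.10}
responses = [x for x in population if rng.random() < response_prob[x]]

measured_support = statistics.mean(responses)
bias = measured_support - true_support

print(f"True support:     {true_support:.3f}")
print(f"Measured support: {measured_support:.3f}")
print(f"Bias:             {bias:+.3f}")
```

If both groups responded at the same rate, the measured value would differ from the truth only by sampling error.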
Factors that determine sample sizes with probability sampling:
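\[n = \frac{Z^2 \times \sigma^2}{e^2}\]
where \(Z\) is the z-score for the chosen confidence level, \(\sigma\) is the population standard deviation, and \(e\) is the acceptable margin of error.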
Goal: Estimate the average satisfaction score (0-10 scale)
\[ n=\frac{Z^2\sigma^2}{e^2} =\frac{(1.96)^2(6.25)}{(0.50)^2} = 96.04 \]
Survey at least \(n=97\) customers to estimate the mean satisfaction score within \(\pm 0.50\) points at 95% confidence.
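The same calculation as a short Python sketch. The function name `sample_size_for_mean` is hypothetical; \(\sigma = 2.5\) is used because the worked example has \(\sigma^2 = 6.25\). The result is always rounded up, since rounding down would miss the target margin of error.

```python
import math

def sample_size_for_mean(z, sigma, e):
    """Return the raw and rounded-up sample size for estimating a mean.

    z: z-score for the confidence level, sigma: population standard
    deviation, e: acceptable margin of error.
    """
    n_raw = (z ** 2) * (sigma ** 2) / (e ** 2)
    return n_raw, math.ceil(n_raw)

n_raw, n = sample_size_for_mean(z=1.96, sigma=2.5, e=0.50)
print(f"Raw n = {n_raw:.2f}, survey at least {n} customers")
```

Halving the margin of error to 0.25 roughly quadruples the required sample size, because \(e\) enters the formula squared.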
MKTG 440 | Prof. Nolan